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ABSTRACT 



Weak gravitational lensing provides a direct statistical measure of the dark matter 
distribution. The variance is easiest to measure, which constrains the degenerate prod- 
uct o-80[]-6. The de generacy is broken by measuring the skewness arising from the fact 
that densities must remain positive, which is not possible when the initially symmetric 
perturbations become non-linear. Skewness measures the non-linear mass scale, which 
in combination with the variance measures Q.q directly. We present the first detection 
of dark matter skewness from the Virmos-Decart survey. We have measured the full 
three point function, and its projections onto windowed skewness. We separate the 
lensing mode and the B mode. The lensing skewness is detected for a compensated 
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Gaussian on scales of 5.37 arc minutes to be = 1.06 ± 0.06 x 10 ^. The B-modes 

are consistent with zero at this scale. The variance for the same window function is 
k2 = 5.32 ±0.62 ±0.98 x 10"^ resulting in 5*3 = 375+124- Comparing to N-body simula- 
tions, we find < 0.5 at 90% confidence. The Canada-France-Hawaii- Telescope legacy 
survey and newer simulations should be able to improve significantly on the constraint. 

Subject headings: Cosmology-theory-simulation-observation: gravitational lensing, dark 
matter, large scale structure 

1. Introduction 

A direct measurement of the mass distribution in the universe has been a major challenge 
and focus of modern cosmology. The deflection of light by gravity results in the gravitational 
lensing effect, which is a direct measure of the strength of gravitational clustering. Recently several 
groups have been able to measure this weak gravitational lensing effect (Bacon et al. 2002; Refregier 
et al. 2002; Hoekstra et al. 2002; Van Waerbeke et al. 2002; Jarvis et al. 2002; Brown et al. 2002; 
Hamana et al. 2002b). Due to the weakness of the effect, all detections have been statistical in 
nature, primarily in regimes where the signal-to-noise is less than unity. 

One of the degeneracies in the measurement of the power spectrum is that between the normal- 
ization of the amplitude of fractional fluctuations parameterized by cjg, the linearly extrapolated 
standard deviation in spheres of radius 8h~'^ Mpc, and the present day density of matter O^q. One 
measures the fluctuations in the projected mass density k, which can either arise by a large to- 
tal mass density with small fluctuations, or a smaller mean mass density with larger fractional 
fluctuations. Typically one constrains the combinations (78rio^^(Van Waerbeke et al. 2002). A 
very similar degeneracy arises in all dynamical measurements of mass, including redshift space 
distortions (Kaiser 1987), cluster abundance (Pen 1998), galaxy peculiar velocities, and pairwise 
velocities (Davis & Peebles 1983). 

In the standard model of cosmology, fluctuations start off small, symmetric and Gaussian. 
As fluctuations grow by gravitational instability, this symmetry can no longer be maintained: 
overdensities can be arbitrarily large, while under dense regions can never have less than zero mass. 
This leads to a skewness in the distribution of matter fluctuations. The effect has been studied 
theoretically (Bernardeau et al. 1997) and initial detections have been reported (Bernardeau et al. 
2002). In second order perturbation theory, one finds that the skewness scales as the square of 
the variance, and inversely to density. In terms of the dimensionless surface density k and source 
redshift Zg, one has 

53 . ^ OC n^^Z--^^. (1) 

5*3 does not depend on the power spectrum normalization, but does require knowledge of the source 
distribution. 
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The detection of this effect in real data is challenging. The shear field is only sampled at highly 

irregularly spaced galaxy positions. Just to accurately measure the effective windowed variance of 
the density field one needs to resort to the two point correlation function (Pen et al. 2002) which 
can be done cheaply computationally, or through an optimal estimator which requires sophisticated 
algorithms for the large data sets (Padmanabhan et al. 2002). For low signal-to-noise data, the 
two point correlation function is the optimal procedure to measure windowed variances and power 
spectra. 

To measure the skewness of a distribution sampled on an irregular pattern requires measuring 
the three point function. This is itself a computationally complex tasks for the spin 2 weak lens- 
ing shear field. Bernardeau et al. (2003) explored a simplified approach using specific geometrical 
configurations in the shear pattern. They applied their method to the Virmos-Descart data and 
found the amplitude and the shape of their 3-point shear correlation function over angular scales 
ranging from one to five arc-minutes follow theoretical expectation of popular cosmological CDM 
models. Although their detection is strong (4.9 a), the cosmological interpretation of non-linear 
features together with the properties of the 2-point shear correlation function is difficult. In partic- 
ular, one does not know yet how it can be used to break the ftQ-a^ degeneracy, as the skewness of 
the convergence field can do. Hence, despite recent theoretical investigations of the 3-point shear 
correlation function (Schneider &; Lombardi 2002; Zaldarriaga & Scoccimarro 2002; Takada & Jain 
2002), the projection of the three point function onto the skewness of the smoothed convergence is 
still the easiest and most direct way to constrain cosmological scenario with cosmic shear data. 

In this paper we present the first measurement of dark matter skewness from the Virmos- 
Descart survey^. Despite the strong inhomogeneities in the galaxy distribution produced by the 
masking, we found an optimal weighting scheme that allows us to compute a robust and reli- 
able skewness estimator. The skewness and variances are computed with the same compensated 
Gaussian smoothing window, and a full separation into modes has been accomplished. We have 
performed N-body simulations to calibrate the results. 

2. Data 

The Virmos-Descart data consist in four uncorrelated patches (referred as field F02, FIO, F14 
and F22 according to their RA position) of about 4 square-degrees each and separated by more 
than 40 degrees. The fields have been observed with the CFH12k panoramic CCD camera, mounted 
at the Canada-France-Hawaii Telescope prime focus, over the periods between January 1999 and 
November 2001. 

The observations and data reduction have been described in previous Virmos-Descart cosmic 
shear papers (Van Waerbeke et al. 2000, 2001, 2002). A detailed discussion on the data quality will 
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be presented elsewhere (McCracken et al in preparation), but most of the data properties relevant 
for cosmic shear are already presented in Van Waerbeke et al. (2000, 2001, 2002). Shortly, the 
observations have been done with the I-band filter available on the CFH12k camera with typical 
exposure time of one hour. The median seeing of the data set is 0.85 arc-second and the limiting 
depth is Iab = 24.5. All data were processed at the Terapix data center^. Although the data 
observed during that period cover 11.5 sq-degrees, a significant fraction of the field is lost by the 
masking procedure, so the final area covered by the data discussed in this work is 8.5 sq-degrees. 

Catalog construction, galaxy selection, PSF anisotropy correction and shape measurements 
are presented in Van Waerbeke et al. (2000, 2001, 2002). The final cosmic shear catalog contains 
392,055 galaxies with magnitude Iab > 22 with median Iab =23.6. Several careful checks have 
demonstrated that systematic residuals are very small. However, Van Waerbeke et al. (2001, 2002) 
and Pen et al. (2002) have shown that a B-mode signal still remains on scales larger than 10 arc- 
minutes. Its origin is not yet understood. Below that angular scale, its amplitude is smaller than 
the E-mode, confirming the gravitational nature of the signal. Van Waerbeke et al. (2001, 2002) 
and Bcrnardeau et al. (2002) have used these catalogs to measure the 2-point and 3-points shear 
correlation functions and constrain cosmological models. 

The method can be sumarized as the followingVan Waerbeke et al. (2001): The HDF samples 
are complete up to the Iab = 27**^ magnitude, and can therefore be used to calibrate the redshift 
distribution of our VIRMOS sample. In practice, we identify each redshift bin rim. (z) of the HDF's 
corresponding to each magnitude bins rrii (magnitude bins have a fixed width of Am = 0.5). The 
redshift distribution of the VIRMOS sample is estimated by adding up the rimiiz) with a proper 
weighting Wmi- The weight corresponds to the ratio of number counts, per magnitude bin, of 
the VIRMOS and the HDF catalogues. The completeness of VIRMOS (which occurs 2.5 magnitudes 
brighter than the HDF) is then properly taken into account. 

The resulting histogram for this sample taken from the HDF is shown in Figure 1. In the 
absence of any spectroscopic survey deeper than Iab = 22 this is the best redshift estimate at the 
moment. In the future, any accurate cosmological parameter interpretation from the skewness will 
require precise determination of the source redshift. 

3. Simulations 
3.1. N-body 

Both the variance and skewness arise from non-linear scales, and simulations are required to 
calibrate theory (White & Hu 2000; Jain et al. 2000). The past studies had shown that Gaussian 
treatments are not accurate for error bar estimates. The sample variance depends on the actual 
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Fig. 1. — The histogram shows the redshift distribution of the galaxies with Ij^b > 22 with their 
appropriate noise weighting. It was computed from the photometric redshifts of the Hubble Deep 
Fields north and sourth (see Van Waerbeke et al. (2001)). 
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survey geometry, for which simulations are the only tractible approach to quantify these effects. 

We ran a series of N-body simulations with different values of J7o to generate convergence maps 
and make simulated catalogs to calibrate the observational data and estimate errors in the analysis. 
The power spectra for given parameters were generated using CMBFAST (Seljak & Zaldarriaga 
1996) and these tabulated functions were used to generate initial conditions. The power spectra were 
normalized to be consistent with the earlier two point analysis from this data set (Van Waerbeke 
et al. 2002). We ran all of the simulations using a parallel, Particle-mesh N-body code (Dubinski, 
J., Kim, J., Park, C. 2003, in preparation) at 1024^ mesh resolution using 512^ particles on an 
8 node quad processor Itanium Beowulf cluster at CITA. Output times were determined by the 
appropriate tiling of the light cone volume with joined co-moving boxes from z ~ 6 to z = 0. We 
output periodic surface density maps at 2048^ resolution along the 3 independent directions of the 
cube at each output interval. These maps represent the raw output for the run and are used to 
generate convergence maps in the thin lens and Born approximations by stacking the images with 
the appropriate weights through the comoving volume contained in the past light cone. 

All simulations started at an initial redshift Zi = 50, and ran for 1000 steps in equal expansion 
factor ratios, with box size L = 200h~^ Mpc comoving. We adopted a Hubble constant h = 0.7 
and a scale invariant n = 1 initial power spectrum. A flat cosmological model with J7o + = 1 was 
used. Four models were run, with Qq of 0.2, 0.3, 0.4 and 1. The power spectrum normalization ug 
was chosen as 1.16, 0.90, 0.82 and 0.57 respectively. 

We used the simulations in two modes. To study scaling relations and quick analyses, we 
projected them to convergence maps which were all analyzed with idealized noise on a square 
domain. We also used the simulations to make mock catalogs which we processed through the full 
pipeline that was used on the real data. Using the mock catalogs we can quantify the effect of 
sample variance on ^3. The errors are expected to be asymmetric: when Uq is large, skewness is 
small, and the sample variance in skewness is also small. So if one measures a large skewness, one 
can strongly rule out a high value of Qq. But if intrinsic skewness is large, its sample variance is 
also large. So measuring a small skewness does not rule out a low unless one has a very large 
field and good statistics. 

3.2. Simulated Convergence and Shear Maps 

Using our N-body simulations we generated two sets of maps. One is a set of idealized k 
convergence maps, from which we can easily measure noise-free statistical quantities. The second is 
a set of 7 shear maps sampled at the actual 2-D galaxy positions of the survey, which are processed 
through the same pipeline as the actual observed Virmos-Descart data set. The redshift weights 
were taken to be the statistical average, so this does not take into account the source-lens-clustering 
effect (Hamana et al. 2002a; Bernardeau 1998). Comparing the results from the two procedures 
allows us to cross-check our analysis, and adds confidence to the interpretation of the complex 
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statistical procedure in terms of simulated viewable dark matter maps. 

The convergence k is the projection of the matter over-density S along the line of sight 6 
weighted by the lensing geometry and source galaxy distribution. It can be expressed as 

<d) = ^^o w{z)Six,e)dx. (2) 

X is the comoving distance in units of c/Hq. The Hubble constant is parametrized by Hq = 
100 h km/s/Mpc. The weight function W{z) is determined by the source galaxy distribution 
function n{z) and the lensing geometry. For a flat universe, 

/•oo 

Wiz) = (1 + z)x{z) J n{z,)[l - x{z)/x{zs)]dzs. (3) 

n{z) is normalized such that n{z)dz = 1. For Virmos-Descart, we adopt the distribution shown 
in figure 1, which is not easily fit by an analytic function. 

During each simulation we store 2D projections to the midplane of the dark matter overdensity 
5 =. 5p/p through the 3D box at every light crossing time through the box along each of the x, y and 
z axes. Our 2D surface density sectional maps are stored on 2048^ grids. After the simulation, we 
stack sectional maps separated by the width of the simulation box, randomly choosing the center 
of each section and randomly rotating and flipping each section. The periodic boundary condition 
guarantees that there are no discontinuities in any of the maps. 

We then add these sections with the weights given by onto a map of constant angular size, 
which is generally determined by the maximum projection redshift. To minimize the repetition of 
the same structures in the projection, we alternatively choose the sectional maps of x, y, z directions 
during the stacking. Using different random seeds for the alignments and rotations, we make 40 
maps for each cosmology. Since the galaxy distribution peaks at z ~ 1, the peak contribution of 
lensing comes from z ~ 0.5 due to the lensing geometry term. Thus a maximum projection redshift 
z ~ 2 is sufficient for the lensing analysis. We integrate the maps for the Q.q = 1, 0.4, 0.3 simulations 
to z = 2 and obtain 40 maps each with angular width 0^ = 4.09, 3.18 and 3.02 degrees, respectively. 
To make maps for = 0.2 which are large enough to fit the survey fields, we truncated the integral 
at to z = 1.8 and obtained maps with angular width = 2.86 degree. We show one k map from 
an J7o = 0-4 simulation in fig. 2. At this resolution, the skewness is quite obvious. The challenge 
is to recover the skewness from a noisy observation of such a data set. 

Prom the sectional maps, we also produce shear maps. These 7 maps are used to produce mock 
galaxy catalogs to check the data analysis pipeline and to calibrate the results. For each sectional 
n maps, we computed sectional 71 and 72 maps using FFT's. The periodic boundary conditions 
on each section allow us to circumvent the boundary condition issue. Then by the same projection 
procedure, we make 40 71 and 72 maps with the same fields of view. We sample these simulated 
shear maps at the actual galaxy position from the Virmos-Descart catalog, which ensures that the 
mock catalogs have same masks and weights. Since the Virmos-Descart catalog consists of four 
fields, we can make ten full sets of mock catalogs from each simulation. 
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Fig. 2. — A K map in the N-body simulation of a l^o = 0.4 ACDM cosmology. The map width is 
3.18 degree and has 2048^ pixels. The scale is in units of k. The skewness of the distribution is 
apparent. Decreasing the cosmological density while maintaining the same variance forces structures 
to become more linear, and thus more skewed. 
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With these maps, we simulate the Virmos-Descart survey. Our goal is to measure 6*3 of 
smoothed k fields and its dispersion and distribution in simulated survey catalogs to provide cal- 
ibrations for the data analysis. In this paper, we fix the smoothing function as the compensated 
Gaussian function with a filter radius Of = 5.374 arc min. The maps we obtained above are non- 
periodic after the projection. In order to eliminate the edge effect, we cut 10% from the four 
margins of each smoothed k map. For the noise-free maps, the Ss parameter is defined as 

^3(^/) = (4) 
We used the k maps to fit the dependencies of ^3 on Oq. We parametrized the relation as 

S3 = an-^ (5) 

The best fit values for a and /3 are 

a = 66.0 ±2.3, /3 = -0.64 ± 0.03, (6) 

which are similar to those obtained from second order perturbation theory with different window 
functions. 



4. Analysis 
4.1. Optimality 

The optimal estimation of power spectra for Gaussian random fields is well understood. It 
requires a weighting of the data by the inverse of signal plus noise correlators. This procedure 
generally costs of order 0{N^) where N is the number of data points, about one million for our 
catalog where we have two observables, 71,72 per galaxy. Clearly the optimal procedure is pro- 
hibitively expensive. In principle rapid multi-scale iterative algorithms allow an analysis in 0{N) 
(Padmanabhan et al. 2002). The cost prefactor in that algorithm is still large, and we decided 
to use the two point correlation function instead, whose computational cost is 0{NlogN) with a 
small prefactor. 

Computing the two point correlation function is in general not an optimal estimator of the 
power. Being a quadratic estimator, it weights each data bin by only its local noise. Each correlation 
function bin Qi is an inverse noise weighted estimator of the data A for a bilinear form Qf. 

q' = A^N-^QiN-^A (7) 

while an optimal estimator should have used {S + N)~^ as its weights instead. In the limit that 
signal to noise is small, the correlation function is optimal, similarly it is optimal if the signal and 
noise covariance matrices commute. For weak lensing surveys, on small scales signal to noise is 
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small, while on larger scales the coverage is reasonably uniform, so a correlation function is not a 
very poor estimator. 

To discuss the accuracy of the higher moments of a slightly non-Gaussian random field, consider 
the one point statistic of a filtered convergence field. The intrinsic distribution of R is described by 
P^. We observe a noisy realization thereof, R° = R + ly, with ly denotes a noise random variable. Its 
variance is = (t| + a^. Typically the variance is noise dominated, in which limit the fractional 
accuracy of measuring when is known is 



where n is the number of independent samples. On the scales of a few arc minutes where this is 
best measured, signal to noise is below unity, and an accurate measurement is done statistically 
where one can reduce the noise due to the large number of arc minute sized patches in the survey. 

The skewness is even harder to measure. One expects 



The last term is of order unity. Comparing with (8) we notice three losses: 1. The numerical 
prefactor increases. 2. The error scales as a higher power of signal to noise. 3. The denominator 
has increased powers of the converge variance, which is itself a small number. These trends suggest 
that higher moments are increasingly difficult to measure in noisy data. 

Optimality of measuring the three point function is a challenging problem. Gaussian fields 
have zero three point function, so its distribution is intrinsically non-linear and must be measured 
either through high order perturbation theory or in simulations. By optimality of an estimator we 
mean one which minimizes the variance. The variance of the three point function is related to the 
six point function, which is a complex problem. 

We thus scale back our ambitions, and concentrate on the skewness in a smoothed density 
field. In analogy with the two point function, we expect that an inverse noise squared weighting 
should be a reasonable procedure in this noise dominated measurement. 



We first review the efficient computation of two point functions as a warm up exercise for 
the three point. Computationally, we start with the two shear estimates for each galaxy, and its 
noise estimate. The weight for each galaxy is the inverse noise variance. The noise variance is 
the quadratic sum of the intrinsic ellipticity variance crj, and the noise variance due to the PSF 
correction ag- The computation of erg is described in Van Waerbeke et al. (2002): the galaxy 
catalogue is divided into cells of 30 galaxies each in the size-magnitude space. For each cell, the 




(8) 




(9) 



4.2. Two point functions 
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r.m.s. of the ellipticity correction gives an estimate of a^- We arbitrarily put a lower bound of 0.28 

for the effective noise of any galaxy, as we do not want to give too large a weight to any individual 
object. Varying this cutoff does not seem to have significant effects on the results. The resulting 
weight per galaxy is therefore: 

_ 1 

" max(0.28,(7e)2- 

Galaxies are mapped onto a grid using the cloud-in-cell algorithm (Hockney &; Eastwood 1988). 
The same is done for the weights. We now have three gridded quantities, which we fast Fourier 
transform. The autocorrelation of the weights is the inverse Fourier transform of the weight grid 
times its complex conjugate. Similarly, we obtain the two autocorrelations of the shear components, 
and two cross correlations, we denote them ^jj. If the dimensions of the grid are a factor of two 
larger than the spatial extent of the galaxies, the convolution on the grid does not introduce spurious 
boundary conditions. 

The correlation functions are then multiplied by the angular weights, to yield two raw angular 
averaged correlation functions 

|+(r) = i- J[^u{r,e)+C22{r,e)]de 

l-(r) = ^ / [^n{r, 9) - ^22{r, 9)] cos(40) + [i^r. 0) + 9)\ sm{W)d9 (11) 
and an angular averaged weight correlation The final correlation is then 

e± = ^. (12) 

This allows a rapid computation of the two point function in eight Fourier transforms, each 
costing 2.5Ariog2(iV) operations, where N = AuxUy are the dimensions of the discretized grid, 
and the factor of 4 comes from having to use a Fourier grid of double the size to deal with the 
non-periodic boundary conditions. On a PC with the optimized Intel IPP or MKL libraries, the 
machines can sustain over a gigaflop on a 2 Ghz processor, so using even very fine grids Ux ~ 10, 000 
takes a matter of seconds. 

The variance of the noise is uncorrelated between bins of the two point correlation function. 
One can see that the covariance of the two point function is a four point function. For a Gaussian 
field, that can be expanded in terms of two point functions. When the noise is white, a four point 
function only expands into a non-zero quantity when the four points have two coincident pairs. 
This is not possible when cross correlating two different correlation lags, and the covariance is thus 
zero. The error due to noise in each bin of the correlation function is 4/^^,. 

In tensor notation, the two point function can be written as a product of functions which 
depend on the magnitude of separation r, and angles of the unit separation vector x 

Ca&cd(x) = (7a6(0)7c<i(x)) 
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e+(r)-e-(r) 
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{SocSm + SadSbc - Sabred) + "^-{r) 




4.3. Three point functions 



The three point function is computationally more complex. We describe here the algorithm 
we implemented which computes it in 0{N^ log ) operations for N galaxies. We first describe the 
mathematical enumeration of the three point function which is prohibitively expensive. We then 
describe a faster procedure which computes the same quantity in a shorter time. 

At every point we measure two components of the shear. The shear three point function has 
eight components. To reduce the complexity of the problem, we express the components of the 
shear in the coordinate system of the line connecting the first two points. We define the first point 
to be at the origin, and the second point along the x-axis at some distance r. The third point 
can be anywhere in the upper half plane. For each galaxy, we first label it point 1. We search 
rightward, and consider the set of all pairs formed from point 1 and a rightward neighbor, which we 
successively label as point 2. For each pair 1-2, we consider all galaxies with x coordinates larger 
than that of galaxy 1, and y coordinates above the line connecting point 1 to point 2. This results 
in a unique counting of triangles. As for the two point function, we multiply the shear estimates 
by the weight. We then compute the raw three point functions for the weighted shear, as well as 
for the weights. The final three point function is the quotient of the two. 

At face value this would appear to be an 0{N^) operation, which is prohibitively expensive 
for ~ 10^ galaxies which we currently have in each field. Analogous to particle-mesh simulations, 
we map all galaxies onto a two dimensional grid using a chaining mesh. This results in a linked 
list of galaxies residing in each cell, and costs 0{N) operations to construct. This allows us to find 
galaxies within some locus without searching the whole list. 

We enumerate all pairs of galaxies. Each pair has a separation and an angle of the line 
connecting them relative to the x-axis. We grid the list of all pairs onto a four dimensional grid 
labeled by the position of the first galaxy, and their separations and angles. The grid of separations 
are chosen logarithmically. The angles run in the range (— 7r/2, 7r/2), since we only need to count 
each pair once. For each separation and angle, we convolve this list of pairs with the gridded field 
of galaxies. We accumulate the three point function by averaging over the angles. In this process, 
we use only the pair-singlet cross correlation at lags with Ax > and y values above the line 
describing the first pair. This way all triangles are counted exactly once. 

In practice, gridding a Virmos-Descart field with 100,000 galaxies on 512^ cells, using 40 
logarithmic separations and 20 angles takes about 15 minutes on the CITA GS320 alpha server 
which has 32 alpha processors running at 731 Mhz. The procedure requires 15GB of memory to 
store the four dimensional list of pairs. 

The final three point function is defined on a three dimensional grid. In the intermediate 
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pair enumeration we used polar coordinates to save storage and reduce computational cost. Since 
each pair is tagged by the coordinate of the left galaxy, the three point function has some angular 
smearing (in our case 9 degrees), so that large relative errors can accrue when galaxies 2 and 3 are 
nearby to each other but far from galaxy 1. This can be overcome by mapping x to —x and repeating 
the procedure. For the windowed skewnesses these skinny triangles do not contribute much, and 
we neglect this effect. In principle one could construct a three point function in 0{N\og{N)) using 
a tree algorithm. The challenge is that the opening angle has to be chosen small since the shear 
needs to be rotated by the mean angle. 



4.4. Skewness 

We review the integration of the third moment from the three point function. A smoothed k 
map is defined as 

R{yi) = J K{^')U{\^-x.'\)(fx' (14) 
so the third moment can be expressed in terms of an integral over the three point function 

(k3(x)) = iy"K(xOK(x'OK(x''OZ^(|x-x'|)ZY(|x-x''|)ZY(|x-x'''|)d2ardVdVdV'' (15) 

= 27r y ^3{r,x)U3{r,x.)rdr(fx (16) 

where A is the area of integration. The three point function ^3 is defined with the first point at 
the origin, the second along the x axis, and the third anywhere in the plane. The window U3 is the 
overlap integral of three filter functions placed at the same three locations. Due to symmetry, it is 
sufficient to consider only the upper half of the plane. 

The integrands can be evaluated analytically for a compensated Gaussian filter (Crittenden 
et al. 2002). We choose 



This filter has zero area, and is normalized to have a peak amplitude of unity in Fourier space. 
This has the feature that the filter will only damp modes, and never amplify. Its Fourier transform 
is Z^(A;) = (A;ro)^exp + l)/2. The filter peaks at wave number = -v/2/ro. When ro is 

measured in radians, k is the spherical harmonic number I. When integrated over a flat spectrum 
with constant I'^Ci, the area in Fourier space is e^/8. One can use this window as a reasonable broad 
band Fourier power estimator. One obtains the equivalent fiat power by multiplying the variance 
by B/e^. The effective full width at half max width of the filter in Fourier space is a multiplicative 
factor of 2.34. 

The corresponding shear filter is 

e = " - ^ I r'«(/)*' = exp (l^ + 1) (18) 
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The filtered convergence field is then 

k(x) = J jij{x')Q{\x-^'\)AxiAxj(fx'. (19) 

Due to the term in front of Q, the hatted objects in the integral are polynomials, and the 
integrand contains only Gaussian integrals over polynomials, which are exactly solvable. The 
smoothed variance can be derived in terms of the tensor shear correlation (13) 

(4) = J Ucd{y^)Wabcd{y^)(fx (20) 



for the window function 

(fx' 



Wa6cd(x) = ^^^^^2^4 j x'^x'^{x'^-Xc){x'd-Xd)ex.^ 



x'^ + {x' - xf _^ 2 



2rl 



4(27r)2r4 

^{^achd + KdhcYt - "^iSacXbXd + ^adXhXc + ^hcXaXd + hdXaX<!)rl + XaX^XcXd 



28r^7r 



x^ 



xe^P(-i;2+2j (21) 

where we have dropped all trace terms which do not contribute. 
For the skewness we can similarly write 

{k^) = 2'k J Ubcdef{r,^)Wabcdef{r,^)rdr(fx (22) 

where the corresponding window function Wabcdef is an integral described below. 

Since the filter (17) is the Laplacian of a Gaussian, convolving by it is equivalent to first taking 

the Laplacian of the k field, and convolving that by a Gaussian. The Laplacian of the k field is 
also a local second derivative of the shear field, for which a purely local decomposition of the shear 
field into E and B mode results independent of survey geometry. We thus expect minimal coupling 
between E and B modes from survey geometry. 

To construct the analogue for the three point function, it is convenient to define the vectors 
which point from each vertex to the sum of the other two vertex vectors q = x' + x" , q' = x" — 2x' 
and q" = x' — 2x". They are depicted geometrically in Figure 3. 

The non-vanishing terms in the integral are 

Wabcdef = (27r)22337rl0 i^^^b'i'Me^f 

+3 X 4:5acqbQWeQf + cyclic permutations) 
+9{6acSbe + SaeSbc)qWf + (12 pcrmutations) 
+9 X 2{5acSbd + ^adhc)qeQf + (3 cycUc permutations)] 
x'2 + x"2 + (x"-x')2T 



X exp 



(23) 
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Fig. 3. — three point configuration: we place one vertex of the triangle at the origin, the second 
along the x axis, and the third in the upper half plane. The q vectors are the axes along which 
each shear component needs to be projected. 
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To check for systematics, one can compute the moments of the B mode skewness (k^) by a 45 degree 
rotation, corresponding to (71,72) — > (72)— 7i)- Similarly one can get the moments {ReR\) and 
{r\Rb)- When we substitute the three point window (23) into three point integral (16), we place 
x' along the x-axis, requiring only a one-dimensional integral, and only integrate x" on the upper 
half of the plane, since we only count each triangle once. The sum over the 6 indeces only has 8 
independent components, corresponding to the two shear components at each of the three vertices. 

In analogy to the two point function, we can compute the expected noise variance in each 
windowed skewness. The covariance between bins is again zero for white noise, since the only 
non-trivial six point function consists of three pairs. As for the two point function, we measure 
the noise from the variance between bins in the three point function, and assume that the noise 
is linearly proportional to the inverse square root. We verified the procedure by sampling four of 
the simulated images in the = 0.2 simulation on a regular 400^ grid with equal weights, and 
computing the skewness from the three point function. For these 2.86 degree images this skewness 
agreed with the image skewness to better than ten percent on the 5.37 arc minute scale, which is 
consistent with the variation expected from the different weighting of the boundary. 

5. Results 

We show the measured skewness in Figure 4. The variance in the same filter from the two 
point function is shown in Figure 5. We find that statistical errors from galaxy noise are small, but 
on all scales we see some B mode. The B mode is more significant in the variance measurement 
than in the skewness. This is to be expected, since variances are always positive, while skewness 
may cancel. 

After having obtained these statistics, we were primarily concerned by systematic errors, of 
which the B mode is a potential diagnostic. In our arbitrary binning, the scale at 5.37 arc minutes 
had the smallest B mode in the variance, so we proceeded with all further analyses using this 
scale. At this scale we find = 1.06 ± 0.06 x 10~^. These errors are Gaussian and contain only 

contributions from noise. Similarly, we find a variance at the same scale of = 6.30 ± 0.62 x 10"^ . 
To model the possible effects of the B mode, we added the observed B power to the error bar, and 
subtracted the measured B power from the signal. Our adopted value for subsequent analysis is 
= 5.32 lb 0.62 ± 0.98 x 10"^, where second error bar is the magnitude of the B mode. 

The B-mode may or may not contribute to the skewness, which depends on the parity of the 
source of B modes. The measured value is consistent with zero, so we neglect it in the skewness 
interpretation. Since variances are always positive, one expects any spurious source of systematics 
to increase the observed variance. Any corrections due to a observed B mode would also correct 
the observed signal down. For the skewness, the effect could have either sign. We therefore did not 
correct the (k^) values. 

Using the skewness and variances we can construct ^3. The actual value of the B variance is 
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Fig. 4. — Skewness of the shear field. The lower panel shows the negative values for any variable 
which goes below zero. The boxes with error bars are the E mode. The stars are the B mode, the 
circles are EB^ and the crosses are E^B. The error bars reflect the scatter expected in the absence 
of a signal, and does not include signal sample variance. 
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Fig. 5. — Variance of the shear field. The right side label is offset by a factor of 8/e^ which gives the 
conversion to flat bandpower. The boxes are the lensing E mode minus the non-lensing B mode, 
while the crosses are the non-lensing B mode. The error bars include the statistical error from shot 
noise plus the amplitude of B. The solid curve is the log- linear Fourier space window function used 
for the current analysis, which corresponds to a scale of 5.37 arc minutes. The vertical dotted line 
marks this scale. The dashed curve is the ensemble average obtained from 40 simulated maps in 
the 0,0 = 0.4 model. 
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added to to the error bar of the variance. The noise errors are Gaussian since they arise from the 

mean of a very large number of data points. In the absence of a better model, we also treat the 
systematic error as Gaussian. The resulting S3 is shown in Figure 6. Random variables which are 
quotients of Gaussians, such as S3, don't generally have well defined moments. The error bars in the 
plot are obtained by taking one sigma on each of {R^) and {k?), and plotting the observed change 
in 53. For sums of Gaussians this is like summing the standard deviations, which is always an 
overestimate of the actual standard deviation of the sum. One can consider the error bars in figure 
6 to be a conservative estimate. For our scale of 5.37 arc minutes, we used a second error metric by 
Monte-Carlo sampling of Gaussian noise. Nominally, 5*3 = 375, and the Monte-Carlo sample finds 
the median and percentile spreads are ^3 = 3761^25 

at 68% confidence, and ^3 = 376^111 at 95%. 
Again, these error bars do not include sample variance, which depends on cosmological models and 
require N-body simulations to calibrate. 

In principle, the fit (5) gives us a value of Oq, which would be a very small value, indicating 
that the data prefers a low density universe with no practical lower bound. The error bar and 
confidence interval is a tricky task, since it requires knowledge of the area and geometry of the sky 
patch surveyed. In principle, it should scale as the inverse square root of the number of independent 
patches as given by Equation (9). At the scales of the filter size of 5.37 arc minutes the survey is 
quite irregular, and it is not easy to quantify the effective area of sky observed, nor the correlation 
between patches. So to measure the scatter we use the noise-free mock catalogs from simulations. 
We process these catalogs through the same three point function simulation pipeline. Even though 
the simulated catalogs have no noise, we used the same weights as the noisy data. This allows us 
to separate the error from sample variance from the error from noise. 

In figures 7 and 8 we show the results for a fio = 0.4 model obtained through the pipeline. 
The points are the average from ten mock catalogs, and the error bars are the standard deviation 
across the ten catalogs. The mode separation for the variances is typically better than 1%, while 
skewness estimates have crosstalk at the 10% level. The plots also include the ensemble averages 
from the 40 direct idealized k maps, processed through an independent analysis which shares only 
the N-body simulation output, which is a useful cross-check of the pipeline. The fit from Peacock &; 
Dodds (1996) is also plotted as the dotted line, which provides a second reference for the resolution 
of the simulation. We note good agreement for scales larger than 1 arc minute, but below which 
the simulations are limited by resolution. 

While the error bars in figures 4 and 5 only accounted for the noise, the simulation plots show 
only the sample variance. The sample variance is clearly decreasing at smaller scales, where more 
independent patches are used. At scales below one arc minute, the simulations are probably limited 
by resolution. 

To estimate the sample variance errors, we used a sample of forty simulated galaxy catalogs. 
This number is limited by the expense of computations. The sample consisted of our four simulation 
models with ten catalogs for each model. We scaled the skewness and variance in each model by 



-20- 



1000 - 







-1000 - 



1 



1000 

n 1 1 — I — I — I — I — 1 



"1 r 



J I I I I L 



10 



Tq (arcmin) 



Fig. 6. — 5*3 for the data, where we used — B"^ as the estimator for k^. We added the ampHtude 
of the B mode to the error bar, which is described in the text. 
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Fig. 7. — Skewness in a noise free = 0.4 simulation processed through ten mock catalogs and 
the entire pipeline. Actual galaxy positions and weights were used. The symbols have the same 
meaning as in figure 4, with the points denoting the average over ten mock catalogs, and the 
error bar the standard deviation. The solid curve is the ensemble average from the k maps. The 
agreement is a cross check on the analysis pipeline. We note that the estimate of the standard 
deviation has an error of 47% with the 10 mock catalogues that were used. 
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Fig. 8. — Variances in the same simulation as figure 7. Tlie mode separation works better for the 
variance than for the skewness, and the mock catalogs have significantly smaller sample variance 
on than for !<? . The solid curve is the same as the dashed line in figure 5, which is the ensemble 
average from the k maps. The dotted line is the fit from Peacock h Dodds (1996). 
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the inverse of the standard deviation of the same quantity obtained from the simulated maps. This 
set of normalized samples was used to Monte Carlo sample 10000 instances of and k?. For a 
given trial value of wc fixed to the observed value, and used from Equation 5. The 
normalized samples are scaled to these constraints, and we randomly draw from the 40 simulated 
values. We add noise drawn from a Gaussian with the appropriate variance. This allows us to form 
a value for for each sample. We find the confidence bounds using a uniform Bayesian prior. 
For Q.Q = 0.25 we found 20% of the samples to have 5*3 larger than the observed value, while for 
= 0.5 only 10% of the samples hasd Sz larger than observed. For VLq = 1 such large values of 
^3 were never found. We note that our limited simulation sample only provides a crude estimate 
of the errors. It is probably not meaningful to derive 95% statistics from this small sample whose 
wings are non-Gaussian. The preliminary interpretation is that a low value of VLq is favored. More 
detailed systematic numerical studies are in progress. No practical lower bound on VLq is derived, 
since below fio = 0.2 the empirical fit from Equation (5) has not been calibrated. 

We compared the sample variance in the simulated maps to that of the mock catalogs. For 
the scale of interest, we found the scatter in from the mock catalogs shown in figure 8 consistent 
with that of a 8.5 sq degree R map, which is the effective area sampled by our survey. The scatter 
in K'^ from the mock catalogs was significantly larger than in a 8.5 sq degree contiguous map, 
possibly closer to what one sees across 3 sq degree maps. This suggests that the inverse noise 
variance weights are not optimal to estimate skewness. Unfortunately with the limited number of 
mock catalogs, the error on the sample variance is large, and more simulations are needed. To test 
the resolution of the simulations, we also compared the results for a simulation with half as many 
particles in directions on a grid twice as coarse. In the VLq = 0.3 model, the resulting 6*3 on the R 
maps changed by less than one percent. The variance and skewness separately changed by about 
10% and 20% respectively, and one would expect the changes to be significantly smaller compared 
to a higher resolution simulatio. We thus expect that simulation resolution should not be a major 
source of error on these scales. 

We also examined the variance between the four Virmos-Descart fields. At the scale of 5.37 arc 
minutes, the fields F14, F02, F22 and FIO had R^ of 8.7 x 10"^, 1.1 x 10-^ -5.1 x lO"'^, 1.1 x 10"^ 
respectively. Their individual values of were 3.0 x 10-5 , 7.0 X 10-5 , 7.5 x lO^^ , 7.0 x 10-^ without 
any subtraction oi a B mode. This scatter is consistent with that seen in the mock catalogs. One 
could in principle use the observed scatter to model the sample variance, but with only four fields, 
the error in the scatter is -^2/3 which is very large, so we used the larger sample of simulated mock 
catalogs. 

Several other factors also affect the skewness interpretation. Source-Lens-Clustering (Hamana 
et al. 2002a; Bernardeau 1998) for example systematically reduces the observed skewness, which 
would strengthen the inferred bounds on VLq. Using Table 3 from Hamana et al. (2002a), the 
closest model to our data is probably CI, which for a non-evolving source distribution and the 
closest comparable window size of 10 arc minutes results in a 16% effect. 
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The current analysis is limited by sample variance, which will improve dramatically with 
the CFHT legacy survey. The error on the error estimate is not yet well determined, requiring 
more simulations to quantify. Errors in the source redshift distribution also effect the results. 
These effects are expected to be small compared to the sample variance. Work on this issue is 
in progress, and we expect an order of magnitude improvement in the near future. The precise 
numerical computation of skewness is also a challenge, as different groups measure slightly different 
normalizations for the variance (White & Hu 2000; Jain et al. 2000). For the current data, the 
differences are not substantial, but need to be tightened with larger and more accurate galaxy 
catalogs. Currently only one angular scale was used, which is one with the smallest B mode 
contamination in the variance measurement. Even on this optimal scale, the B mode will soon 
become a limiting factor when the survey area is increased. 



6. Conclusions 

We have presented the first direct measurement of dark matter skewness using weak gravita- 
tional lensing in the Virmos-Descart survey. We chose the compensated Gaussian with a scale of 
5.37 arc minutes for all analyses, which had the smallest B mode, consistent with zero. At this 
scale, the skewness of the dark matter is = 1.06 ± 0.06 x 10~^. We measured the variance 
= 5.32 ± 0.62 ± 0.98 x 10"^. 

This implies 6*3 = 3751*^ '^24- Calibrating to mock catalogs from N-body simulations, we find 
f^o < 0.5 at 90% and < 0.25 at 80% confidence. The errors are dominated by sample vari- 
ance. The CFHT legacy survey will increase the sky area by a factor of 20, which should result 
in a decrease of a factor of 4 in the error in a year's time. Better constraints could also be ob- 
tained by going to smaller scales, where sample variance is smaller. This requires higher resolution 
simulations, faster three point analysis, and a better understanding of the origin of the B mode. 
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